***************************************************************************
* Project: Measuring Media Freedom: An Item Response Theory Analysis      *
* Authors: Jonathan A. Solis (jsolis@aiddata.wm.edu) & Philip D. Waggoner *
* Journal: British Journal of Political Science (2020)                    *
* Task: Replicate table 2 in BJPS manuscript                              *
* Date: 01/27/2020                                                        *
* Software: Stata 15.1                                                    *
***************************************************************************

****************************************************************************************
* Note: -We obtained the original dataset and Stata .do file of Egorov et al.'s (2009) *
* 		 "Why Resource-poor Dictators  Allow Freer Media" and merged our new MSF point *
*        estimates and standard deviations                                             *
*       -Dataset obtained via email from lead author Georgy Egorov Sept 21, 2017       *
*       -Following Egorov et al.'s research design, we take the lead of the MSF data   *
****************************************************************************************

*Load Egorov et al. (2009) dataset merged w/ MSF data
clear all
cd ""
use bjps_msf_rep-egorov
set more off

*Set up analysis
xtset cow year

***********************************
*---------------------------------*
***** TABLE 2, BJPS Manuscript*****
*---------------------------------*
***********************************

*Model 1, BJPS manuscript (Table 2)  
* -> original Egorov et al., model 2 in table 1 (pg. 658)
* -> syntax "gen byte m2=e(sample)" creates variable to identify sample
*    used in this model; we use this to create a comparable sample with
*    the new MSF data models
xtreg mflead logoilresvalue logoilresvalue_polity polity2 loggdppcppp  ///
 logpoptotal loggovgdp _Iyear*  if year>=1992 , fe
*Identify sample
gen byte m2=e(sample)

*-> Other relevant values
*R-sq= .1204
estat ic
*AIC= 13028.35

*Model 2, BJPS manuscript (Table 2)
* -> original Egorov et al., model 2 in table 1 (pg. 658) but replace Freedom 
*    House data w/ MSF point estimates
* -> syntax "& m2==1" ensures we analyze a comparable sample w/ new MSF data
xtreg msflead logoilresvalue logoilresvalue_polity polity2 loggdppcppp ///
  logpoptotal loggovgdp _Iyear*  if year>=1992 & m2==1, fe 

*-> Other relevant values
*R-sq= .277
estat ic
*AIC= -6749.84

*Model 3, BJPS manuscript (Table 2) 
* -> see Monte Carlo Simulations section below

*Model 4, BJPS manuscript  
* -> original Egorov et al., model 5 in table 1 (pg. 658)
* -> syntax "gen byte m5=e(sample)" creates variable to identify sample
*    used in this model; we use this to create a comparable sample with
*    the new MSF data models
xtreg mflead logoilresvalue loggdppcppp logpoptotal loggovgdp _Iyear* ///
 if year>=1992 &  polity292<6, fe 
*Identify sample
gen byte m5=e(sample)

*-> Other relevant values
*R-sq= .0483
estat ic
*AIC= 6363.231

*Model 5, BJPS manuscript (Table 2)
* -> original Egorov et al., model 5 in table 1 (pg. 658) but replace Freedom 
*    House data w/ MSF point estimates
* -> syntax "& m5==1" ensures we analyze a comparable sample w/ new MSF data
xtreg msflead logoilresvalue loggdppcppp logpoptotal loggovgdp _Iyear* ///
  if year>=1992 &  polity292<6 & m5==1, fe 

*-> Other relevant values  
*R-sq=.2123
estat ic
*AIC=-2718.859

*Model 6, BJPS manuscript (Table 2) 
* -> see Monte Carlo Simulations section below

*************************************************************
***Monte Carlo SIMULATIONS, BJPS Manuscript (models 3 & 6)***
*************************************************************

****************************************************************************************
* Note: -To execute the Monte Carlo Simulations, we clear and reload original dataset  *
*       -Monte Carlo procedure requires us to generate new data (750 random draws      *
*        of MSF lead distribution) for each model                                      *
*       -Code repurposed from http://www.unified-democracy-scores.org/example.html    *
****************************************************************************************

*Model 3, BJPS manuscript (Table 2) 
* -> original Egorov et al., model 2 in table 1 (pg. 658) but replace Freedom 
*    House data w/ MSF point estimates
* -> runs model 750 times with values randomly pulled from the MSF leads distribution
*    in each of the 750 models 

*Load Data
clear all
cd " "
use bjps_msf_rep-egorov
set more off

*Set data for panel analysis
xtset cow year

*Run model to mark observations in Egorov et al.'s original sample for comparability
quietly xtreg mflead logoilresvalue logoilresvalue_polity polity2 loggdppcppp  ///
 logpoptotal loggovgdp _Iyear*  if year>=1992, fe
*Identify sample
gen byte m2=e(sample)

*Generate 750 random draws from MSF lead distribution *
* * * * * * * * * * * * * * * * * * * * * * * * * * * *

*Set seed for reproducibility
set seed 157782

*Loop for 750 draws
forvalues i = 1/750 {
gen x`i'=rnormal(msflead,sdlead)
}
*

*****************************
**** Prep for the Monte Carlo
*****************************

      set more off
      set matsize 750
	  
      *** Note that the matsize must be at least as large as the larger
      *** dimension in your posterior matrix. Since we use only 750 draws,
	  *** setting the matsize to 750 is sufficient for Stata IC, MP, and SE.
	  *** Executing a model with more draws where the dimensions rise above 800
	  *** will require a Stata more powerful than IC.

      *** Run the Monte Carlo
      forvalues i = 1/750 {
      *** Print out an iteration number
      display `i'

      *** Fit the model, using the ith draw from the MSF lead posterior
	  quietly xtreg x`i' logoilresvalue logoilresvalue_polity polity2 loggdppcppp ///
      logpoptotal loggovgdp _Iyear*  if year>=1992 & m2==1, fe 

	  *** Extract the coefficients and variance-covariance matrix
      matrix b = e(b)
      matrix V = e(V)
      local blength = colsof(b)

      *** Preserve the dataset, take a single multivariate normal draw from the
      *** posterior distribution of the coefficients, and restore the dataset.
      *** We use the capture command to catch possible errors in drawnorm
      *** and drop these iterations gracefully.
      preserve
      capture quietly drawnorm b1-b`blength', double n(1) means(b) cov(V) clear
      if _rc == 0 {
          mkmat b1-b`blength', matrix(bsample)
          matrix posterior = nullmat(posterior) \ bsample
                   }
        else {
             display "Error drawing sample...iteration dropped"
             }
			 
        restore  
	 } 
*

*** Create variables from matrix
svmat posterior

*** Calculate means and standard deviations
* -> NOTE: These are the results found in table 2 (model 3, BJPS manuscript)
tabstat posterior*, stat(mean sd) 

*** Reset matrix size (not necessary to execute code but here if you need to reset the matrix size)
matrix drop _all

*Model 6, BJPS manuscript (Table 2)
* -> original Egorov et al., model 5 in table 1 (pg. 658) but replace Freedom 
*    House data w/ MSF point estimates
* -> runs model 750 times with values randomly pulled from the MSF lead distribution

*Load Data
clear all
cd " "
use bjps_msf_rep-egorov
set more off

*Set data for panel analysis
sort cow year
iis cow

*Run model to mark observations in Egorov et al.'s original sample for comparability
quietly xtreg mflead logoilresvalue loggdppcppp logpoptotal loggovgdp _Iyear* ///
 if year>=1992 &  polity292<6, fe 
*Identify sample
gen byte m5=e(sample)

*Generate 750 random draws from msf lead distribution *
* * * * * * * * * * * * * * * * * * * * * * * * * * * *

*Set seed for reproducibility
set seed 17893

*Loop for 750 draws
forvalues i = 1/750 {
gen x`i'=rnormal(msflead,sdlead)
}
*

*****************************
**** Prep for the Monte Carlo
*****************************

      set more off
      set matsize 750
	  
      *** Note that the matsize must be at least as large as the larger
      *** dimension in your posterior matrix. Since we use only 750 draws,
	  *** setting the matsize to 750 is sufficient for Stata IC, MP, and SE.
	  *** Executing a model with more draws where the dimensions rise above 800
	  *** will require a Stata more powerful than IC.
	  
      *** Run the Monte Carlo
      forvalues i = 1/750 {
      *** Print out an iteration number
      display `i'

      *** Fit the model, using the ith draw from the MSF lead posterior
	  quietly xtreg x`i' loggdppcppp logoilresvalue  logpoptotal loggovgdp _Iyear* ///
      if year>=1992 &  polity292<6 & m5==1, fe 

	  *** Extract the coefficients and variance-covariance matrix
      matrix b = e(b)
      matrix V = e(V)
      local blength = colsof(b)

      *** Preserve the dataset, take a single multivariate normal draw from the
      *** posterior distribution of the coefficients, and restore the dataset.
      *** We use the capture command to catch possible errors in drawnorm
      *** and drop these iterations gracefully.
      preserve
      capture quietly drawnorm b1-b`blength', double n(1) means(b) cov(V) clear
      if _rc == 0 {
          mkmat b1-b`blength', matrix(bsample)
          matrix posterior = nullmat(posterior) \ bsample
                 }
      else {
          display "Error drawing sample...iteration dropped"
           }
		   
        restore
	 } 
*

*** Create variables from matrix
svmat posterior

*** Calculate means and standard deviations
* -> NOTE: These are the results found in table 2 (model 6, BJPS manuscript)
tabstat posterior*, stat(mean sd)

*** Reset matrix size (not necessary to execute code but here if you need to reset the matrix size)
matrix drop _all
